379 research outputs found

    Dictionary writing system (DWS) plus corpus query package (CQP): the case of TshwaneLex

    In this article the integrated corpus query functionality of the dictionary compilation software TshwaneLex is analysed. Attention is given to the handling of both raw corpus data and annotated corpus data. With regard to the latter it is shown how, with a minimum of human effort, machine learning techniques can be employed to obtain part-of-speech tagged corpora that can be used for lexicographic purposes. All points are illustrated with data drawn from English and Northern Sotho. The tools and techniques themselves, however, are language-independent, and as such the encouraging outcomes of this study are far-reaching.

    A corpus-based survey of four electronic Swahili-English Bilingual dictionaries

    In this article we survey four different electronic bilingual dictionaries for the language pair Swahili-English. Aided by a data-driven morphological analyzer and part-of-speech tagger, we quantify the coverage of the dictionaries on large monolingual corpora of Swahili. In a second series of experiments, we investigate how applicable the dictionaries are as a tool in the development of a machine translation system, by evaluating bilingual coverage on the parallel SAWA corpus. At the same time we attempt to consolidate the dictionaries into a unified lexicographic database and compare its coverage to that of its component parts.

    Automatic Detection of Online Jihadist Hate Speech

    We have developed a system that automatically detects online jihadist hate speech with over 80% accuracy, by using techniques from Natural Language Processing and Machine Learning. The system is trained on a corpus of 45,000 subversive Twitter messages collected from October 2014 to December 2016. We present a qualitative and quantitative analysis of the jihadist rhetoric in the corpus, examine the network of Twitter users, outline the technical procedure used to train the system, and discuss examples of use. Comment: 31 pages

    Improving the Computational Morphological Analysis of a Swahili Corpus for Lexicographic Purposes (Lexikos 18, AFRILEX-reeks/series 18: 2008, 303-318)

    Computational morphological analysis is an important first step in the automatic treatment of natural language and a useful lexicographic tool. This article describes a corpus-based approach to the morphological analysis of Swahili. We focus our discussion on its ability to retrieve lemmas for word forms and evaluate it as a tool for corpus-based dictionary compilation.

    Orthodontic management of a migrated maxillary central incisor with a secondary occlusal trauma

    Introduction: Normal or excessive occlusal forces exerted on teeth with reduced periodontal support might result in a secondary occlusal trauma. This type of injury is diagnosed based on histological changes in the periodontium. Multiple clinical and radiographic indicators are, therefore, required as surrogates to assist the presumptive diagnosis of a (secondary) occlusal trauma. Case Presentation: In this case report, the diagnosis, management, and 1-year follow-up of a secondary occlusal trauma of a maxillary central incisor are described. The occlusal relationship was rehabilitated with fixed orthodontic appliances and was further stabilized with both fixed and removable retainers. Conclusions: A combined periodontal-orthodontic approach for a secondary occlusal trauma allows the rehabilitation of periodontal, occlusal, and esthetic parameters. Twelve months after the end of the active orthodontic treatment, a combination of fixed and removable retainers proved effective in retaining the treatment outcome.